Registering clock driver controlled decision feedback equalizer training process

ABSTRACT

A method is described. The method includes receiving from a memory controller configuration information for a testing sequence and storing the configuration information in configuration register space of the driver circuit. The method also includes controlling the testing sequence. The testing sequence includes sweeping values of a tap coefficient of a DFE circuit of the driver circuit and sweeping a voltage of a slicer of the driver circuit. The method includes sending results of the testing sequence to the memory controller. The results are to determine a value for the DFE tap coefficient.

RELATED CASES

This application is a continuation of and claims the benefit of U.S. patent application Ser. No. 15/716,460, entitled, “REGISTERING CLOCK DRIVER CONTROLLED DECISION FEEDBACK EQUALIZER TRAINING PROCESS”, filed Sep. 26, 2017, which is incorporated by reference in its entirety.

FIELD OF INVENTION

The field of invention pertains generally to computing systems, and, more specifically, to a registering clock driver controlled decision feedback equalizer training process.

BACKGROUND

The performance of a computing system is heavily dependent on the computing system's system memory (also referred to as main memory). As such, system designers are highly motivated to improve the performance of system memory. However, improving the performance of system memory can entail the inclusion of sensitive circuits that require training and configuration. Reducing the overhead of the training and configuration is a design goal; otherwise, e.g., too much time will be spent waiting for the memory system to be operable.

FIGURES

A better understanding of the present invention can be obtained from the following detailed description in conjunction with the following drawings, in which:

FIG. 1 shows a memory system;

FIG. 2 shows a registering clock driver circuit;

FIG. 3 shows an improved registering clock driver circuit;

FIG. 4 shows a decision feedback equalizer (DFE) receiver circuit;

FIG. 5 shows a DFE training process;

FIG. 6 shows a DFE circuit of the improved registering clock driver circuit of FIG. 3;

FIG. 7 shows an improved memory controller;

FIG. 8 shows a method performed by the improved registering clock driver circuit of FIG. 3;

FIG. 9 shows a computing system.

DETAILED DESCRIPTION

FIG. 1 shows a schematic diagram of a system memory implementation 100. As observed in FIG. 1, the system memory implementation includes a memory controller 101 on the host side. The memory controller 101 includes a double data rate (DDR) memory bus interface 102 that is coupled to the wiring of a DDR memory bus that is disposed, e.g., on the motherboard of a computing system.

A pair of dual in-line memory modules (DIMMs) 104_1, 104_2 are coupled to the memory bus. The memory bus includes N control signal wires (where N is an integer) that together are referred to as the Command Address “CA” bus 103. The control signals that are carried on the CA bus 103 include, to name a few, a row address strobe signal (RAS), a column address strobe signal (CAS), a write enable (WE) signal and a plurality of address (ADDR) signals.

As observed in FIG. 1, the control signals 103 are intercepted by a registering clock driver (RCD) circuit 106_1, 106_2 on each of the DIMMs 104_1, 104_2. For ease of illustration, FIG. 1 indicates that the chip select (CS) signals are not intercepted by the RCD circuits of the respective DIMM card that they are directed to. However, in various implementations, the RCD circuits receive and redrive the CS signals of their respective DIMM card.

FIG. 2 shows a more detailed view of an RCD circuit 206. As observed in FIG. 2, the RCD circuit 206 includes redriver circuitry 207 to redrive the N CA control signals on the CA bus 203 to multiple loads on the DIMM (specifically, the Q different memory chips on the DIMM where each rank on the DIMM includes Q/2 memory chips). The RCD circuit 206 may be characterized as a form of driver circuit because of its signal driving function. Here, the control signals that are transported on the CA bus 203 are high speed signals that might otherwise suffer debilitating distortion without the help of the RCD circuit 206. That is, as the control signals propagate through the connector 208 that connects the DIMM to the motherboard, they may suffer various forms of signal distortion such as loss of high frequency components and varied delay/skew.

The RCD circuit 206 essentially samples these signals as they emerge through the connector 208 and the redrive circuitry 207 redrives fresh, “cleaner” digital pulses to the Q memory chips 209 on the DIMM 204. Additionally, as can be seen from FIG. 2, the RCD circuit 206 also provides a single point of termination for each of the control signals after they emerge from the connector 208. The single point of termination essentially eliminates any further corruption of these signals that would otherwise occur if each control signal were to continue to propagate to the multiple Q loads that each of the memory chips on the DIMM impose.

As is known in the art, adequately driving multiple loads is more difficult to achieve, particularly for high speed signals. As such, the RCD circuit 206 removes this challenge from the signals that emerge directly from the connector 208 by terminating them with only a single load at the front end of the RCD circuit 206 (which is not depicted in FIG. 2 for illustrative ease) and redriving these same signals with the redrive circuitry 207 to each of the Q multiple loads. The redrive circuitry 207 may be composed, e.g., of registers that latch the control signals in parallel and then redrive them on a following clock edge so as to remove any skew that may have been present across the set of incoming signals that were sampled.

With the development of even higher speed system memory bus technologies, future RCD circuits will need to include sophisticated receiving circuitry (rather than just a traditional termination) at the front end of the RCD circuit in order to ensure that the control signals are properly received.

FIG. 3 shows one approach in which a decision feedback equalizer (DFE) circuit 310_1 through 310_N is placed at the front end of the RCD circuit 306 along each incoming high speed control signal path in order to properly determine the digital information stream of each signal before it is presented to the redrive circuitry 307.

FIG. 4 shows an embodiment of a decision feedback equalizer (DFE) circuit 410. As observed in FIG. 4, a DFE circuit 410 includes a summation circuit 411 at its front end, a decision slicer circuit 412 at its back end and a feedback path 413 that includes multiple taps for each of multiple previously sampled digital bit values. A DFE circuit 410 is a circuit that is very well suited to remove inter symbol interference (ISI), which is a highly problematic form of distortion for high speed digital signal streams. Very high speed digital pulses tend to lose their sharp rectangular shape as they propagate along a signal trace, resulting in more drawn out, wider, rounder shapes by the time they are received at the receiving end. The wider rounded pulses can extend into the time slot of neighboring pulses. The pulse shape of any pulse can therefore become even more corrupted by the interfering waveforms of its neighboring pulses in the digital stream.

DFE circuits are well suited to remove this specific kind of distortion because each “tap” in the feedback path 413 can be specially tuned to remove the interference from a particular pulse that has preceded (was received ahead of) the current pulse whose digital value is presently being determined.

Here, the decision slicer 412 determines the digital value (1 or 0) of the digital pulse that is currently being received. A first feedback path 414_1 that flows from a first tap along the feedback path 413 includes a first coefficient W₁ that is specially configured to remove any interference from the digital pulse that immediately preceded the digital pulse that is currently being received. Here, the coefficient W₁ attempts to capture the amplitude (or amount) of the immediately prior pulse that interferes with the current pulse and the summation circuit 411 then subtracts this particular amplitude from the currently received signal. The subtraction ideally completely removes any/all interference from the immediately prior pulse with the pulse that is currently being received.

Other feedback paths from the other taps along the feedback path behave similarly except their coefficients are tuned for different, respective earlier received pulses (path n−1 ideally removes interference from the pulse that was received n−1 time slots ahead of the current pulse; path n ideally removes interference from the pulse that was received n time slots ahead of the current pulse). Thus, the DFE circuit 410 of FIG. 4 is designed to remove any interference from the n pulses that were received ahead of the current pulse. The remaining discussion assumes a DFE that is designed to remove ISI from the four preceding pulses (n=4). However, removal of interference from four preceding pulses is only exemplary (other DFE implementations may be designed to remove the interference from more or less than four preceding pulses).
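The feedback subtraction described above can be illustrated with a small behavioral model. The following Python sketch is not taken from the described circuit: the function names `channel` and `dfe_receive`, the mapping of bits to +1/−1 symbols, and the example ISI weight of 1.2 are all assumptions chosen only to show how a tap coefficient cancels interference from an earlier pulse.

```python
def channel(bits, isi):
    """Hypothetical channel: each sample is the current symbol plus
    weighted copies of earlier symbols (inter symbol interference)."""
    symbols = [1.0 if b else -1.0 for b in bits]
    samples = []
    for k, s in enumerate(symbols):
        interference = sum(w * symbols[k - 1 - j]
                           for j, w in enumerate(isi) if k - 1 - j >= 0)
        samples.append(s + interference)
    return samples


def dfe_receive(samples, taps, threshold=0.0):
    """Behavioral DFE: summation stage, slicer, and feedback of past decisions."""
    history = [0.0] * len(taps)  # earlier decisions as +1/-1, most recent first
    bits = []
    for x in samples:
        # Summation circuit: subtract the estimated ISI of each earlier pulse.
        corrected = x - sum(w * d for w, d in zip(taps, history))
        # Decision slicer: compare the corrected sample against the threshold.
        bit = 1 if corrected > threshold else 0
        bits.append(bit)
        # Shift the new decision into the feedback path.
        history = [1.0 if bit else -1.0] + history[:-1]
    return bits


sent_bits = [1, 0, 0, 1, 1, 0, 1, 0]
received = channel(sent_bits, isi=[1.2])
with_dfe = dfe_receive(received, taps=[1.2])     # tap matched to the channel
without_dfe = dfe_receive(received, taps=[0.0])  # feedback disabled
```

In this toy case the matched tap recovers the transmitted bits, while disabling the feedback leaves several bits misread. A real channel would not be known in advance, which is why the tap coefficients have to be trained.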

Referring back to the improved RCD circuit 306 of FIG. 3, which includes a DFE circuit 310_1 through 310_N along the path of each incoming high speed CA control signal, a potential problem is the amount of time that will be consumed determining the appropriate coefficient value W_(i) for each tap of each feedback path of every DFE circuit. Here, the amount of ISI that a particular pulse will receive is highly dependent on its specific channel. As such, the tap coefficient settings for each of the DFE circuits are apt to be different (each DFE circuit may have its own unique set of tap coefficient settings).

The appropriate tap coefficient settings are determined, e.g., during system boot-up with a series of test patterns that are propagated through each high speed CA control signal trace. Training circuitry then analyzes the DFE's behavior in response to the patterns and determines appropriate tap coefficient settings for the DFE. A more detailed discussion of a specific training embodiment is provided further below.

The improved RCD also includes a training controller 321, training configuration space 322 and DFE coefficient register space 323. The training controller 321 performs various tasks to support the training routine that determines the correct tap coefficients for each of the DFE circuits 310_1 through 310_N.

FIG. 5 shows an embodiment of a flow diagram 500 of the training sequence. As can be seen from the flow diagram 500, the training sequence involves the cooperation of both the host side memory controller and the RCD circuit. In various implementations, the functions that are performed by the RCD circuit during the training sequence are controlled by the training controller 321 of FIG. 3.

Initially, the host side memory controller determines 501 and sends 502 configuration information for a first round of training tests. As will become more clear in the following discussion, the program flow can be seen as an outer loop 504 in which training is being performed for a particular DFE tap coefficient value, and where the training entails incrementing the value for the particular DFE tap coefficient on each outer loop iteration 504 so as to effectively “sweep” through a number of different values for the particular DFE tap coefficient. In various embodiments, the memory controller determines 507 the minimum coefficient tap value, the maximum coefficient tap value and the increment that is to be applied to the coefficient tap value on each outer loop iteration. That is, the memory controller determines 507 the values for the coefficient value that are to be swept over by way of the outer loop iterations 504. The RCD receives this information and stores it 509 in the testing configuration register space 322.

Moreover, the memory controller also determines 507 a set of test parameters for an inner loop 503 in which reference and/or threshold voltages for the DFE are swept through. Here, the memory controller determines 507 a minimum reference/threshold voltage, a maximum reference/threshold voltage and an increment to be applied to the reference/threshold voltage with each inner loop iteration 503. That is, the memory controller also determines 507 the values for the DFE reference/threshold voltage that are to be swept over by way of the inner loop iterations 503. The RCD also receives this information and stores it 509 in the testing configuration register space 322.
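The configuration information described above (a minimum, a maximum and an increment for each swept quantity) can be pictured as a small record. The Python sketch below is illustrative only: the class names `SweepRange` and `TrainingConfig`, the field names, and the use of integer codes (e.g., millivolts or register codes) are assumptions and do not describe an actual register layout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SweepRange:
    """Hypothetical sweep description: minimum, maximum and increment."""
    minimum: int
    maximum: int
    increment: int

    def values(self):
        """Return every value visited by the sweep, inclusive of both ends."""
        if self.increment <= 0 or self.minimum > self.maximum:
            raise ValueError("sweep needs increment > 0 and minimum <= maximum")
        return list(range(self.minimum, self.maximum + 1, self.increment))


@dataclass(frozen=True)
class TrainingConfig:
    """Hypothetical contents of the testing configuration register space."""
    tap_index: int             # which tap coefficient is being trained
    tap_sweep: SweepRange      # outer loop: tap coefficient codes
    voltage_sweep: SweepRange  # inner loop: reference/threshold voltage, in mV


config = TrainingConfig(
    tap_index=1,
    tap_sweep=SweepRange(minimum=0, maximum=6, increment=2),
    voltage_sweep=SweepRange(minimum=400, maximum=800, increment=100),
)
```

With these example numbers the outer loop would visit four tap codes and, for each of them, the inner loop would visit five voltage settings.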

With the testing parameters for the complete training of a particular coefficient tap value being stored in the RCD's testing register space 322, the RCD circuit's training controller 321 is able to take over responsibility for implementing and overseeing the training for the coefficient tap value. Having the training controlled from the RCD (by way of the training controller 321) is pertinent because it greatly reduces communication/traffic between the RCD and the memory controller as compared to a solution in which the memory controller was responsible for the primary oversight of the training.

FIG. 6 shows a DFE circuit 610 and associated circuitry 614 that is specially designed to support the training exercises that are performed during the outer loop 504 and inner loop 503 training iterations. Here, the feedback path 613 is designed such that its different tap coefficient values (W₁, W₂, . . . W_(n-1)) are configurable input parameters (i.e., the respective values for the different tap coefficients can be programmed into the DFE circuit (e.g., by being entered into registers associated with the feedback path)).

Thus, in various embodiments, nominal execution of an outer loop 504 entails configuring the tap coefficient to be trained with a first of its sweep values (e.g., the minimum or maximum value for the tap coefficient's sweep range as just provided by the memory controller), and setting the reference/threshold voltage to its initial sweep value to set up the inner loop for the first outer loop iteration.

With respect to the inner loop sweep, referring to FIG. 6, a variable voltage divider circuit 614 provides a reference voltage input to the main decision slicer circuit 612. As is known in the art, various signal traces of a memory bus (such as the high speed CA control signal lines of a high performance DDR memory bus) are designed to include a reference voltage V_(REF). In various embodiments, the reference voltage V_(REF) is ideally set midway between a logic high voltage and a logic low voltage that is propagated on the signal line (also referred to as a “trace”). In various embodiments, the variable voltage divider circuit 614 enables the reference voltage V_(REF) on the incoming signal trace to be adjusted, and V_(REF) is one of the parameters that can be varied during the inner loop training 503 of the DFE. Here, in various embodiments, the main decision slicer 612 determines whether the input control signal is a logical 1 or 0 by comparing the control signal against the reference voltage.

Additionally, the output of the summation circuit 611 is coupled not only to the main decision slicer 612 of the DFE circuit 610 but also to a second instantiation of the decision slicer 615 that is used during the inner loop training 503. Here, the second decision slicer 615 includes a variable threshold voltage V_(TH). As is known in the art, a decision slicer decides whether an incoming voltage waveform is a 1 or a 0 by comparing the incoming voltage waveform against its internal threshold voltage V_(TH) (if the waveform is greater than the threshold voltage the incoming signal is deemed to be transporting a 1, if the waveform is less than the threshold voltage the incoming signal is deemed to be transporting a 0).

By varying the second slicer's threshold voltage, the voltage height of the “eye pattern” of an incoming test pattern signal can be determined. Specifically, during inner loop training 503, a test pattern signal is sent from the host side (e.g., the system memory controller of the computing system) to the DFE being trained. The signal is processed by the main decision slicer 612 and the second slicer 615. The decisions made by both slicers 612, 615 are then provided to a comparator 616 and compared. If the voltage threshold of the second slicer 615 is raised high enough that it approaches the voltage height of the incoming waveform itself, the second slicer 615 will regularly interpret the waveform incorrectly, which will result in a dramatic difference between the output values generated by the main slicer 612 and the second slicer 615 (the comparison of the two outputs by the comparator 616 will result in many differences). At this point, the height of the eye pattern is approximately reached.

Likewise, if the voltage threshold of the second slicer 615 is lowered enough that it approaches the voltage floor of the incoming waveform, it will also regularly interpret the waveform incorrectly, which will result in a dramatic difference between the output values generated by the main slicer 612 and the second slicer 615 (the comparison of the two outputs by the comparator 616 will result in many differences). At this point, the floor of the eye pattern is approximately reached. Thus, the approximate HI and LO voltage levels of the incoming waveform can be measured by adjusting the internal threshold voltage of the second slicer 615.
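The two-slicer comparison described above can be sketched in a few lines of Python. This is only a behavioral illustration under stated assumptions: the names `mismatch_count` and `eye_opening`, the millivolt sample values, and the rule that a threshold counts as inside the eye only when the two slicers never disagree are all hypothetical, and the sketch assumes the agreeing thresholds form one contiguous window.

```python
def mismatch_count(samples, v_ref, v_th):
    """Comparator model: count samples on which the main slicer (threshold
    v_ref) and the second slicer (threshold v_th) decide differently."""
    return sum((x > v_ref) != (x > v_th) for x in samples)


def eye_opening(samples, v_ref, thresholds, max_mismatches=0):
    """Return (lowest, highest) swept threshold at which the second slicer
    still agrees with the main slicer, or None if no threshold qualifies."""
    clean = [v for v in thresholds
             if mismatch_count(samples, v_ref, v) <= max_mismatches]
    return (min(clean), max(clean)) if clean else None


# Hypothetical sampled test pattern, in millivolts: highs near 900, lows near 100.
samples_mv = [900, 120, 850, 150, 880, 100]
sweep_mv = list(range(0, 1001, 50))

opening = eye_opening(samples_mv, v_ref=500, thresholds=sweep_mv)
```

For this example the slicers agree for thresholds from 150 mV to 800 mV, so the measured eye is bounded by the highest low-level sample and the lowest high-level sample, to within the 50 mV sweep increment.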

Thus, referring to FIG. 5, in various embodiments, inner loop training 503 entails sweeping the threshold voltage V_(TH) of the second decision slicer 615 while the tap coefficient that is being trained is fixed at the current outer loop value. That is, for a single outer loop 504 iteration, the inner loop 503 runs through a number of iterations to sweep the threshold voltage V_(TH). Once the inner loop sweeping 503 is completed, the outer loop 504 advances to a next iteration 506 and the tap coefficient value is incremented 506 to its next sweep value. At this point, training for the new incremented tap value can begin and the inner loop sweeping 503 of the threshold voltage is repeated.

Returning to the execution of the inner loop sweeping 503, once a next V_(TH) value is set for the current inner loop iteration, the host side/memory controller sends a test pattern 504 that includes a sufficient number of sample pulses. The decisions from both slicers 612, 615 are compared by the comparator 616 and the results of the comparison, in one embodiment, are stored in the RCD circuit.

The internal threshold voltage V_(TH) of the second slicer 615 is then adjusted by some increment (e.g., a tenth of a volt, a hundredth of a volt, etc.) away from its prior setting (e.g., the internal threshold voltage is lowered from the prior maximum setting to the maximum setting less the increment) to advance to the next iteration of the inner loop 503. The host/memory controller then sends another test pattern 504 and the comparison from both slicer outputs is stored in the RCD. The internal threshold voltage of the second slicer 615 is then adjusted again by the increment (e.g., the internal voltage is lowered to the maximum setting less two increments), another test pattern 504 is sent and the comparison results are stored in the RCD circuit. The process continues until a full “sweep” of the internal threshold voltage of the second slicer 615 has been achieved (e.g., after a final iteration, the lowering of the internal threshold voltage by the increment causes the internal threshold voltage to reach its minimum voltage), which completes the inner loop 503.

With the completion of the inner loop measurements 503, the RCD circuitry sends 505 the comparison results to the host memory controller. In an alternative implementation, rather than storing the comparison results in the RCD with each inner loop iteration and reporting comparison results only once per outer loop iteration, comparison results are sent to the memory controller more regularly. For example, comparison results are sent with each inner loop iteration or at some other interval that is more frequent than once per outer loop iteration (e.g., one feedback signal describing comparison results is sent per N (e.g., 8 unit interval) test patterns sent by the memory controller). Accordingly, with greater feedback frequency, the communication behavior between the RCD and the memory controller is more interactive during test pattern training runtime.

The host memory controller then studies the results and determines 501 the value for the particular coefficient tap that the inner loop was executed for (possibly other, better estimates for the remaining tap coefficients of the DFE circuit are also determined). The host then sends 502 the newly determined value(s) to the RCD circuit, which stores 508 the value(s) in the general configuration register space 323 of the RCD circuit. Going forward, the value stored in the general register space 323 will be used to program the DFE.
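The text does not specify how the host derives a tap value from the comparison results. Purely as an illustration of one plausible criterion, the Python sketch below picks the swept tap value whose threshold sweep produced the largest number of mismatch-free settings (i.e., the widest apparent eye). The names `clean_window` and `choose_tap`, the shape of the `reports` dictionary and the selection rule itself are assumptions, not the described method.

```python
def clean_window(results, max_mismatches=0):
    """Count swept thresholds whose comparator mismatch count is acceptable.

    results maps a threshold setting to the mismatch count observed there."""
    return sum(1 for mismatches in results.values()
               if mismatches <= max_mismatches)


def choose_tap(reports):
    """Pick the tap value with the most acceptable thresholds.

    reports maps each swept tap value to its per-threshold results.
    Ties are resolved in favor of the smallest tap value."""
    return max(sorted(reports), key=lambda tap: clean_window(reports[tap]))


# Hypothetical comparison results: {tap value: {threshold in mV: mismatches}}
reports = {
    0: {100: 3, 200: 0, 300: 2},
    1: {100: 0, 200: 0, 300: 0},
    2: {100: 1, 200: 0, 300: 0},
}
best_tap = choose_tap(reports)
```

Here tap value 1 is selected because all three of its threshold settings were free of mismatches.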

With the determined value being entered in general register space 323, the program flow transitions to determining the test configuration for the next outer loop (the next coefficient tap value). The memory controller determines and sends 507 to the RCD circuit the training parameters for the next outer and inner loop sequences (the range and increment for the coefficient tap sweep and the range and increment for the threshold voltage sweep). The RCD circuit then stores 509 these parameters in the RCD's testing configuration register space 322.

The training process then transitions to execution of the outer loop 504 for the new parameter (e.g., the next tap coefficient). In performing the outer loop 504 and its associated inner loop 503, the RCD controller 321 refers to the parameters that were stored in the training configuration space 322 so that the RCD controller performs the outer loop tap coefficient sweeps and inner loop V_(TH) sweeps with the respective increments and across the ranges that the memory controller provided 507.
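The division of labor described above, in which the RCD-side controller walks both sweeps locally and reports results once per outer loop iteration, can be sketched as follows. This Python model is an illustration under assumptions: the function `run_training`, its `measure` and `report` callbacks and the synthetic mismatch formula are hypothetical stand-ins for the hardware measurement and the return channel to the memory controller.

```python
def run_training(tap_values, vth_values, measure, report):
    """Run the nested sweep locally and report once per outer loop iteration."""
    for tap in tap_values:                    # outer loop: tap coefficient sweep
        results = {}
        for vth in vth_values:                # inner loop: threshold voltage sweep
            results[vth] = measure(tap, vth)  # e.g., comparator mismatch count
        report(tap, results)                  # single message for this tap value


messages = []  # stands in for traffic sent back to the memory controller

run_training(
    tap_values=[0, 1, 2],
    vth_values=[100, 200, 300],
    # Synthetic measurement: fewest mismatches at tap 1 and 200 mV.
    measure=lambda tap, vth: abs(tap - 1) + abs(vth - 200) // 100,
    report=lambda tap, results: messages.append((tap, results)),
)
```

In this example nine measurements produce only three messages, which reflects the reduced traffic that results from letting the RCD oversee the sweeps.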

The process then iterates until all coefficient taps for the DFE have been determined. As discussed above, in various embodiments, there is one complete outer loop sweep for each tap coefficient in the DFE. After completion of these outer loop sweeps, a coefficient will have been determined for each feedback path tap and stored in the RCD circuit's general configuration register space 323.

The training process then moves on to a next DFE circuit where the entire process is repeated. After training has been completed for all the DFEs, the DFE training process is complete for the RCD.

It is pertinent to point out that in various embodiments there is one more outer loop sequence performed by the process beyond the number of coefficient taps (e.g., if there are four coefficient taps, five outer loops are performed). Here the inner loop testing for this additional outer loop iteration includes the sending of generic test patterns (e.g., unit interval test patterns) with generic/standard sweep ranges and increments. The comparison data from these sweeps is then sent to the host/memory controller, which determines an initial “first stab guess” at the values for each of the coefficient taps in the DFE circuit and sends these coefficients to the RCD. The DFE's coefficients are then programmed 508 with these values.

The next outer loop then corresponds to determination of the first coefficient tap. The inner loop sweeps 503 that are performed for the first coefficient tap are performed with the initial coefficient tap settings being programmed in the DFE. After the completion of the inner loop 503 and the sending of the comparison results 505 to the host/memory controller for analysis, the host/memory controller determines 501 a new first tap coefficient value and sends 502 it to the RCD, where it is programmed 508 into the DFE (and replaces the “first stab guess” value for the first tap coefficient). The newly determined first tap coefficient value is then used by the DFE when the outer loop training for the next, second tap coefficient value is performed. The memory controller also determines and sends 507 a new set of outer and inner loop testing parameters for the next round of outer and inner loop tests (for the second tap coefficient). The RCD stores 509 these values in testing configuration space 322 and then commences the next outer 504 and inner loop testing 503.

In further embodiments, the host side/memory controller not only provides a coefficient value for the first tap at the completion of the inner loop training for the first coefficient tap but also provides new values for the other coefficient taps to replace the “first stab guess” values for these coefficients as well. The set of coefficient values is then programmed into the DFE and used during the outer and inner loop training that is performed for the next, second coefficient tap value. At the completion of this training, the host at least returns a value for the second coefficient tap and may return a new set of coefficient taps for the DFE (e.g., including, possibly, a new coefficient tap value for the first tap). Regardless of how many new coefficient values are determined per completion of an outer loop, after outer loop training has been executed for each of the tap coefficients, the training of the DFE in various embodiments will be complete.

In various embodiments, the line reference voltage V_(REF) may be varied instead of the second slicer threshold voltage V_(TH). In yet other embodiments a deeper inner loop dimension may be attempted where the outer loop sweeps the coefficient tap value (as discussed at length above), a first inner loop sweeps one of V_(TH) and V_(REF) and a second, deeper inner loop sweeps the other of V_(TH) and V_(REF). Here, increment of the first inner loop voltage does not occur until the deepest (second) inner loop voltage has been fully swept.
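The ordering of the three-level sweep described above can be made concrete with a short Python sketch. The function name `sweep_order` and the example values are assumptions; the sketch relies only on the standard library's `itertools.product`, which varies its rightmost argument fastest and therefore matches the rule that the first inner loop advances only after the deepest loop has been fully swept.

```python
from itertools import product


def sweep_order(tap_values, first_inner, second_inner):
    """List (tap, first inner voltage, second inner voltage) settings in the
    order they would be visited; the last element varies fastest."""
    return list(product(tap_values, first_inner, second_inner))


# Hypothetical values: two tap codes, two V_TH settings, two V_REF settings (mV).
order = sweep_order(tap_values=[0, 1],
                    first_inner=[100, 200],
                    second_inner=[450, 550])
```

The first four settings visited are (0, 100, 450), (0, 100, 550), (0, 200, 450) and (0, 200, 550), after which the tap coefficient advances to its next value.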

In various embodiments, inner loop training can be performed in a manner that does not sweep any adjustable reference voltages. Here, for instance, the host may specify a NULL value for a particular sweep's range and increment. In cases where a NULL is provided for the sweeps of both reference voltages, the host may vary another parameter of the memory bus such as, e.g., the frequency or temporal position (phase) of the clock according to which the incoming signal, whose value the DFE determines, is timed. Here, again, when the comparison information from both slicers is dramatically different, the frequency/phase of the clock will be understood to be at a value for which signal interpretation is unreliable.

FIG. 7 shows an embodiment of a memory controller 701 that can perform the DFE training process of FIG. 5 in cooperation with the improved RCD circuit 306 of FIG. 3. As observed in FIG. 7, the memory controller 701 includes a training controller 731 to oversee and control the memory controller's operations during the training. The memory controller also includes DFE tap determination intelligence 732 and sweep determination intelligence 733. The DFE tap determination intelligence 732 is circuitry that determines, e.g., one or more DFE tap coefficient values in view of comparison results that were provided by the RCD after training. The sweep determination intelligence 733 determines ranges and increments for both line reference voltage V_(REF) and second slicer internal threshold voltage V_(TH) sweeps.

The training controller 731 may be implemented with any combination of dedicated logic circuitry (such as state machine logic circuitry that implements a control state machine for the training) or logic circuitry that executes some form of program code (e.g., an embedded controller or processor that executes training firmware) to perform the control functions for the training performed by the memory controller 701. The DFE tap and sweep determination intelligence 732, 733 may likewise be implemented with any combination of look-up table circuitry (e.g., embedded memory) and/or dedicated logic and/or program code execution logic to look up and/or explicitly calculate DFE tap coefficients and appropriate sweep parameters. The memory controller also includes test pattern generation logic 734 to generate the test patterns that are used to determine the DFE tap coefficients.

Referring back to FIG. 3, the training controller 321 of the RCD circuit 306 may also be implemented with any combination of dedicated logic circuitry (such as state machine logic circuitry that implements a control state machine for the training) or logic circuitry that executes some form of program code (e.g., an embedded controller or processor that executes training firmware) to perform the control functions for the training performed by the RCD. The training controller 321 is coupled to feedback path circuitry of the DFEs to set tap coefficient values. Likewise, the training controller 321 is also coupled to the adjustable reference voltage circuitry 614 and the threshold circuitry of the second slicer 615 to implement respective sweeps of the line reference voltage V_(REF) and/or the internal threshold voltage V_(TH).

In various embodiments the control signals that are received with DFE circuits that are trained according to the procedures described above, and the memory controller and RCD circuit that perform the training, are each compatible with the specifications of a published industry standard such as a Joint Electron Device Engineering Council (JEDEC) DDR memory bus standard (e.g., a DDR5 JEDEC standard).

FIG. 8 shows a method performed by a driver circuit as described above. The method includes receiving from a memory controller configuration information for a testing sequence and storing the configuration information in configuration register space of the driver circuit 801. The method also includes controlling the testing sequence, where the testing sequence includes sweeping values of a tap coefficient of a DFE circuit of the driver circuit and sweeping a voltage of a slicer of the driver circuit 802. The method also includes sending results of the testing sequence to the memory controller, where the results are to determine a value for the DFE tap coefficient 803.

FIG. 9 provides an exemplary depiction of a computing system 900 (e.g., a smartphone, a tablet computer, a laptop computer, a desktop computer, a server computer, etc.). As observed in FIG. 9, the basic computing system 900 may include a central processing unit 901 (which may include, e.g., a plurality of general purpose processing cores 915_1 through 915_X) and a main memory controller 917 disposed on a multi-core processor or applications processor, system memory 902, a display 903 (e.g., touchscreen, flat-panel), a local wired point-to-point link (e.g., USB) interface 904, various network I/O functions 905 (such as an Ethernet interface and/or cellular modem subsystem), a wireless local area network (e.g., WiFi) interface 906, a wireless point-to-point link (e.g., Bluetooth) interface 907 and a Global Positioning System interface 908, various sensors 909_1 through 909_Y, one or more cameras 910, a battery 911, a power management control unit 912, a speaker and microphone 913 and an audio coder/decoder 914.

An applications processor or multi-core processor 950 may include one or more general purpose processing cores 915 within its CPU 901, one or more graphical processing units 916, a memory management function 917 (e.g., a memory controller) and an I/O control function 918. The general purpose processing cores 915 typically execute the operating system and application software of the computing system. The graphics processing unit 916 typically executes graphics intensive functions to, e.g., generate graphics information that is presented on the display 903. The memory control function 917 interfaces with the system memory 902 to write/read data to/from system memory 902. The power management control unit 912 generally controls the power consumption of the system 900.

Each of the touchscreen display 903, the communication interfaces 904-907, the GPS interface 908, the sensors 909, the camera(s) 910, and the speaker/microphone codec 913, 914 can be viewed as various forms of I/O (input and/or output) relative to the overall computing system including, where appropriate, an integrated peripheral device as well (e.g., the one or more cameras 910). Depending on implementation, various ones of these I/O components may be integrated on the applications processor/multi-core processor 950 or may be located off the die or outside the package of the applications processor/multi-core processor 950.

The computing system may also include a memory system, such as system memory (also referred to as main memory) implemented with a memory controller and RCD circuitry that cooperatively perform DFE training for memory bus control signals as described at length above. Here, operations of the testing procedure may, e.g., be stored in non-volatile memory or storage as firmware program code of the computing system. The firmware is loaded into the memory controller and/or RCD during boot-up of the computing system.

Application software, operating system software, device driver software and/or firmware executing on a general purpose CPU core (or other functional block having an instruction execution pipeline to execute program code) of an applications processor or other processor may perform any of the functions described above.

Embodiments of the invention may include various processes as set forth above. The processes may be embodied in machine-executable instructions. The instructions can be used to cause a general-purpose or special-purpose processor to perform certain processes. Alternatively, these processes may be performed by specific hardware components that contain hardwired logic for performing the processes, or by any combination of programmed computer components and custom hardware components.

Elements of the present invention may also be provided as a machine-readable medium for storing the machine-executable instructions. The machine-readable medium may include, but is not limited to, floppy diskettes, optical disks, CD-ROMs, and magneto-optical disks, FLASH memory, ROMs, RAMs, EPROMs, EEPROMs, magnetic or optical cards, propagation media or other type of media/machine-readable medium suitable for storing electronic instructions. For example, the present invention may be downloaded as a computer program which may be transferred from a remote computer (e.g., a server) to a requesting computer (e.g., a client) by way of data signals embodied in a carrier wave or other propagation medium via a communication link (e.g., a modem or network connection).

In the foregoing specification, the invention has been described with reference to specific exemplary embodiments thereof. It will, however, be evident that various modifications and changes may be made thereto without departing from the broader spirit and scope of the invention as set forth in the appended claims. The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense.

What is claimed is:
 1. An apparatus, comprising: a registering clock driver (RCD) circuit comprising a decision feedback equalizer (DFE) circuit, the DFE circuit comprising programmable DFE feedback tap coefficients, the DFE circuit comprising a programmable reference voltage, the DFE circuit comprising a decision circuit that determines a logical value of a signal based on the programmable reference voltage, the RCD circuit further comprising DFE training circuitry to support a training sequence that determines appropriate values for the feedback tap coefficients, the DFE training circuitry comprising: a) test controller circuitry to conduct inner test loops that sweep the reference voltage and outer test loops that sweep a particular one of the DFE feedback coefficient values, wherein, a next outer test loop is executed after all of the inner test loops have been executed, wherein the RCD circuit is to receive a test pattern from a host in between inner test loop iterations, the DFE training circuitry to cause the RCD to send test results to the host so that the host is able to determine an acceptable value for the particular one of the DFE feedback coefficient values; b) register space to store an inner test loop reference voltage increment received from the host and an outer test loop DFE feedback coefficient value increment received from the host, the test controller circuitry to increment the reference voltage by the inner test loop reference voltage with each next inner test loop iteration, the test controller circuitry to increment the particular one of the DFE feedback coefficient values by the outer test loop DFE feedback coefficient value increment with each next outer test loop iteration.
 2. The apparatus of claim 1 wherein the DFE training circuitry comprises an exclusive OR (XOR) circuit to compare a test pattern against DFE decisions made against the test pattern.
 3. The apparatus of claim 2 wherein the test results are generated by the XOR circuit.
 4. The apparatus of claim 1 wherein the RCD circuit is disposed on a memory module.
 5. The apparatus of claim 4 wherein the memory module is plugged into a computer system.
 6. A memory controller, comprising: a dual data rate (DDR) memory interface to interface with a registering clock driver (RCD) circuit, the RCD circuit comprising a decision feedback equalizer (DFE) circuit, the DFE circuit comprising programmable DFE feedback tap coefficients, the DFE circuit comprising a programmable reference voltage, the DFE circuit comprising a decision circuit that determines a logical value of a signal based on the programmable reference voltage, the RCD circuit further comprising DFE training circuitry to support a training sequence that determines appropriate values for the feedback tap coefficients, the memory controller comprising: circuitry to establish inner test loops performed by the RCD circuit that sweep the reference voltage and outer test loops performed by the RCD circuit that sweep a particular one of the DFE feedback coefficient values, wherein, a next outer test loop is executed after all of the inner test loops have been executed, the memory controller to receive test results from the RCD circuit and determine an acceptable value for the particular one of the DFE feedback coefficient values therefrom, the circuitry to determine an inner test loop reference voltage increment and an outer test loop DFE feedback coefficient value increment and to cause the inner test loop reference voltage increment and the outer test loop DFE feedback coefficient value increment to the RCD circuit for implementation by the RCD circuit; test pattern determination circuitry, the memory controller to send a test pattern determined by the test pattern determination circuitry to the RCD circuit in between inner test loop iterations.
 7. The memory controller of claim 6 wherein the memory controller is within a computing system.